CPSC 521 Assignment 8

Author

  • Brad Bingham
Abstract

In 1998, Brin and Page [2] identified the need for an Internet search engine capable of scaling to a huge size. They foresaw that the explosion in the number of web pages would continue, making human-maintained lists of web pages more expensive and less effective. They explain that with this increase in web pages comes an increase in matching pages per query, so the ranking system must be more accurate to ensure quality matches among the top 10 or 20 results returned. The number of search queries is also increasing, which demands high throughput. With these points in mind, they designed their data structures to be distributed and to scale gracefully. However, they note that while technology trends have CPU speed and disk size increasing rapidly, disk seek time and OS robustness may not improve as quickly and become concerns.

The use and economy of Google clusters is described in [1]. Queries parallelize easily, with each query running on many processors that each search a part of the index. Clusters are located worldwide, and each search query is serviced by a geographically nearby cluster with available computing power. Once a query is sent to a particular cluster, index servers map the query words to matching documents, generating the hit-list data structure described in [2]. The search index is tens of terabytes, distributed randomly into index shards local to the index servers. The list of matching documents is returned as docids, unique web-page identifiers. This list is sent to a group of document servers, each searching its own randomly distributed set of web pages. The document servers return the web-page title and the keyword-in-context snippet seen in Google search results. Finally, the results are sent to one of many cluster web servers, where the HTML for the search results is generated and returned to the user's web browser.

These clusters are built from commodity x86 PCs connected with Ethernet switches. At the time of publication of [1], the machines ranged from 533 MHz Celeron to dual 1.4 GHz Pentium III systems, each with one or more local IDE hard drives. The authors argue for the economy of a cluster built from this commodity equipment. In their example, a commodity cluster costs about a third as much, has 22 times as many processors, three times as much RAM, and slightly less disk space than a highly integrated multiprocessor server. Of course, the high-end server is more reliable and has lower communication latency, but these properties are not critical to the cluster. Network traffic is expected to be low because the application is highly parallelized, and failures are handled by a fault-tolerance scheme based on replication [4]. Failed equipment such as disk drives and power supplies is batched and replaced regularly.

Though highly integrated multiprocessors may not be economically viable, they do give performance gains. In [1], the authors explain that their application does not exhibit much instruction-level parallelism; in fact, they state that the four-issue Pentium 4 gives worse performance than the three-issue Pentium III. Parallelism is instead realized at the thread level, where it is trivially exploited. This points to simultaneous multithreading (SMT) and chip multiprocessors (CMP) for performance gains. SMT is an obvious benefit here, since threads likely spend a long time stalled on cache misses, disk seeks, and mispredicted data-dependent branches; indeed, they claim a 30 percent performance increase on a dual-context SMT. CMPs could yield an even greater performance gain.
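As a rough illustration of the query flow described above, the following Python sketch fans a query out to a set of index shards (each holding the inverted index for a disjoint, randomly assigned subset of pages), unions the resulting docid hit lists, and asks document servers for titles and keyword-in-context snippets. The names used here (IndexShard, DocumentServer, serve_query) are assumptions made for this sketch only and do not reflect Google's actual implementation.

```python
# Illustrative sketch of sharded query serving, not Google's actual code.
from collections import defaultdict


class IndexShard:
    """Inverted index over a random, disjoint subset of the crawled pages."""

    def __init__(self, pages):
        # pages: {docid: page text}
        self.index = defaultdict(set)  # word -> set of docids (the hit list)
        for docid, text in pages.items():
            for word in text.lower().split():
                self.index[word].add(docid)

    def lookup(self, words):
        """Docids in this shard that contain every query word."""
        hits = [self.index.get(w, set()) for w in words]
        return set.intersection(*hits) if hits else set()


class DocumentServer:
    """Holds the raw pages for a random subset of docids."""

    def __init__(self, pages):
        self.pages = pages  # {docid: (title, page text)}

    def snippet(self, docid, words, width=30):
        """Title plus a keyword-in-context snippet around the first match."""
        title, text = self.pages[docid]
        lowered = text.lower()
        pos = min((lowered.find(w) for w in words if w in lowered), default=0)
        return title, text[max(0, pos - width):pos + width]


def serve_query(query, index_shards, doc_servers):
    """Fan the query out to every shard, merge docids, fetch snippets."""
    words = query.lower().split()
    # Shards hold disjoint pages, so the full hit list is the union of
    # the per-shard lookups.
    docids = set().union(*(shard.lookup(words) for shard in index_shards))
    results = []
    for server in doc_servers:
        for docid in docids & server.pages.keys():
            results.append(server.snippet(docid, words))
    return results
```

In the real system each shard lookup and each document fetch would be a remote call to a different machine, which is what makes query serving embarrassingly parallel across the cluster.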
Systems like the Piranha or, more recently, the Niagara [5] are designed for executing multiple threads that block frequently. They use chip area to give Google's application exactly what it needs: multiple processors with multithreading, short in-order pipelines, and a large L2 cache. However, the cost of something like the Niagara chip could outweigh its benefit.

Google uses its clusters for tasks other than serving web searches, such as building the search index data structure (the inverted index). While these tasks are easily parallelized, there remain the issues of how to distribute the data and how to deal with failures. Ideally, as much of the cluster execution as possible should be abstracted away from the programmer. A method of achieving this abstraction is MapReduce [3]. This technique applies to computations whose input and output are key/value pairs. The user writes a map function that computes intermediate key/value pairs from input key/value pairs. Pairs with the same intermediate key are grouped and sent to a reduce function (also written by the user) that merges their values to produce the output.
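The MapReduce model can be illustrated with the canonical word-count example from [3]. The sketch below is a minimal, single-process approximation: the user supplies the map and reduce functions, and a driver groups intermediate values by key. The driver itself is an assumption of this sketch; a real MapReduce runtime distributes the map and reduce tasks across the cluster and handles machine failures, which is precisely what it hides from the programmer.

```python
# Minimal single-process sketch of the MapReduce programming model.
# A real runtime distributes the work and handles failures; this only
# shows how user code is expressed as map and reduce functions.
from collections import defaultdict


def map_fn(docid, text):
    """User-written map: emit an intermediate (word, 1) pair per word."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """User-written reduce: merge all values that share an intermediate key."""
    yield word, sum(counts)


def map_reduce(inputs, mapper, reducer):
    # Map phase: apply the mapper to every input key/value pair.
    intermediate = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in mapper(key, value):
            intermediate[ikey].append(ivalue)
    # Grouping by intermediate key stands in for the shuffle phase;
    # the reduce phase then merges the values collected for each key.
    output = []
    for ikey, ivalues in intermediate.items():
        output.extend(reducer(ikey, ivalues))
    return output


if __name__ == "__main__":
    docs = [("doc1", "the web grows fast"), ("doc2", "the index grows too")]
    print(map_reduce(docs, map_fn, reduce_fn))
    # [('the', 2), ('web', 1), ('grows', 2), ('fast', 1), ('index', 1), ('too', 1)]
```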



